Workflow: Code Style, Data Tidying, Workflow: Scripts and Projects, Data Import, Workflow: Getting Help

Module 03

Ray J. Hoobler

Workflow: Beyond the Basics

Libraries Used in this Presentation

Code
library(tidyverse)
library(palmerpenguins)

Workflow: code style

Names

Variable names (those created by <- and those created by mutate()) should use only lowercase letters, numbers, and _.

Use _ to separate words within a name.

Code
body_mass_mean <- mean(penguins$body_mass_g, na.rm = TRUE)
body_mass_mean
[1] 4201.754


Code
body_mass_sd <- sd(penguins$body_mass_g, na.rm = TRUE)
body_mass_sd
[1] 801.9545

Tip

Use “long, descriptive names that are easy to understand rather than concise names that are fast to type.”
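As a small illustration of the tip, the sketch below computes the same value twice. The names `flipper_length_median` and `flm` are made up for this example; only the first tells a reader what it holds.

```{r}
library(palmerpenguins)

# Strive for: the name states the variable, the unit context, and the statistic
flipper_length_median <- median(penguins$flipper_length_mm, na.rm = TRUE)

# Avoid: quick to type, but hard to interpret a week later
flm <- median(penguins$flipper_length_mm, na.rm = TRUE)
```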

Spaces (R4DS 2e 4.2)

Put spaces on either side of mathematical operators apart from ^ (i.e. +, -, ==, <, …), and around the assignment operator (<-).

```{r}
#| code-fold: show
# Strive for
z <- (a + b)^2 / d

# Avoid
z<-( a + b ) ^ 2/d
```

Don’t put spaces inside or outside parentheses for regular function calls. Always put a space after a comma, just like in standard English.

```{r}
#| code-fold: show
# Strive for
mean(x, na.rm = TRUE)

# Avoid
mean (x ,na.rm=TRUE)
```

Note

Python's style guide, PEP 8, makes similar recommendations for spaces; however, it recommends not putting spaces around the = sign when it indicates a keyword argument or a default parameter value.

Spaces (R4DS 2e 4.2, cont.)

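```{r}
# flights comes from the nycflights13 package
library(nycflights13)

# Extra spaces are OK when they improve alignment
flights |>
  mutate(
    speed      = distance / air_time,
    dep_hour   = dep_time %/% 100,
    dep_minute = dep_time %%  100
  )
```

```{r}
# The same call without the alignment spaces
flights |>
  mutate(
    speed = distance / air_time,
    dep_hour = dep_time %/% 100,
    dep_minute = dep_time %% 100
  )
```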


Putting each argument of a function call on its own line, with the line break after the comma, is good practice.

Pipes
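A minimal sketch of pipe style, assuming this slide follows the guidance in R4DS 2e section 4.3: put a space before `|>` and make it the last thing on a line. The object name `penguins_summary` is made up for this example.

```{r}
library(dplyr)
library(palmerpenguins)

# Strive for: one step per line, each line ending in |>
penguins_summary <- penguins |>
  filter(!is.na(body_mass_g)) |>
  group_by(species) |>
  summarize(
    body_mass_mean = mean(body_mass_g),
    n = n()
  )

# Avoid: every step on one line with no spaces around the pipe
# penguins|>filter(!is.na(body_mass_g))|>group_by(species)|>summarize(n = n())
```

Named arguments inside `summarize()` each get their own line, which matches the advice on line breaks after commas.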

ggplot2

Sectioning comments

Exercises

Data Tidying

Introduction

Happy families are all alike; every unhappy family is unhappy in its own way.
— Leo Tolstoy

Tidy datasets are all alike, but every messy dataset is messy in its own way.
— Hadley Wickham

Tidy Data

Lengthening Data
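A minimal sketch of lengthening with `tidyr::pivot_longer()`. The table `body_mass_wide` and its values are hypothetical, invented only to show column names that are really values of a variable (the year).

```{r}
library(tidyr)
library(tibble)

# Hypothetical wide table: one column per year (values are made up)
body_mass_wide <- tribble(
  ~species, ~`2007`, ~`2008`,
  "Adelie",    3700,    3750,
  "Gentoo",    5070,    5020
)

# Move the year column names into a "year" column and the cells into "body_mass_g"
body_mass_long <- body_mass_wide |>
  pivot_longer(
    cols = !species,
    names_to = "year",
    values_to = "body_mass_g"
  )
```

The result has one row per species and year, so the two rows of the wide table become four.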

Widening Data
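A minimal sketch of widening with `tidyr::pivot_wider()`. The table `measurements_long` and its values are hypothetical; it stores one observation spread across several rows, which widening collects into one row per `penguin_id`.

```{r}
library(tidyr)
library(tibble)

# Hypothetical long table: each penguin appears once per measurement (values are made up)
measurements_long <- tribble(
  ~penguin_id, ~measurement,        ~value,
  1,           "bill_length_mm",      39.1,
  1,           "flipper_length_mm",  181,
  2,           "bill_length_mm",      46.5,
  2,           "flipper_length_mm",  210
)

# Turn each distinct measurement name into its own column
measurements_wide <- measurements_long |>
  pivot_wider(
    names_from = measurement,
    values_from = value
  )
```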

Workflow: scripts and projects

Scripts

Projects

Data import

Reading Data from a File
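A minimal sketch of reading a CSV file with `readr::read_csv()`. The file is a hypothetical temporary CSV written on the spot so the example is self-contained; in practice the path would point to an existing file.

```{r}
library(readr)

# Hypothetical data written to a temporary file (names and values are made up)
students_path <- tempfile(fileext = ".csv")
writeLines(
  c(
    "student_id,first_name,age",
    "1,Ana,4",
    "2,Ben,5",
    "3,Chen,N/A"
  ),
  students_path
)

# na = lists the strings that read_csv() should treat as missing values
students <- read_csv(students_path, na = c("", "N/A"), show_col_types = FALSE)
```

Because "N/A" is declared as missing, the `age` column is parsed as numeric rather than as text.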

Controlling Column Types
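A minimal sketch of setting column types explicitly with the `col_types` argument of `read_csv()`. The inline data and the column names (`site`, `reading`, `recorded_on`) are hypothetical.

```{r}
library(readr)

# Hypothetical inline CSV; I() marks the string as literal data rather than a path
measurements_csv <- I("site,reading,recorded_on\nA,1.5,2024-01-01\nB,2.0,2024-01-02\n")

# Declare each column's type instead of relying on readr's guess
measurements <- read_csv(
  measurements_csv,
  col_types = cols(
    site = col_character(),
    reading = col_double(),
    recorded_on = col_date(format = "%Y-%m-%d")
  )
)
```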

Reading Data from Multiple Files
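A minimal sketch of reading several files with one `read_csv()` call, assuming they share the same columns. The two monthly sales files are hypothetical and are written to a temporary directory so the example is self-contained.

```{r}
library(readr)

# Two hypothetical files with identical columns (values are made up)
sales_dir <- tempfile("sales")
dir.create(sales_dir)
writeLines(c("item,units", "widget,10", "gadget,4"), file.path(sales_dir, "01-sales.csv"))
writeLines(c("item,units", "widget,7"), file.path(sales_dir, "02-sales.csv"))

# Collect the matching paths instead of typing each one
sales_files <- list.files(sales_dir, pattern = "sales\\.csv$", full.names = TRUE)

# id = adds a column recording which file each row came from
sales <- read_csv(sales_files, id = "file", show_col_types = FALSE)
```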

Writing to a File
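A minimal sketch of writing a data frame with `readr::write_csv()` and reading it back. The `scores` table is hypothetical, and the file goes to a temporary path.

```{r}
library(readr)
library(tibble)

# Hypothetical data to save (values are made up)
scores <- tibble(
  student = c("a", "b", "c"),
  score = c(90.5, 82, 77)
)

# Write to a temporary CSV, then read it back in
scores_path <- tempfile(fileext = ".csv")
write_csv(scores, scores_path)
scores_reread <- read_csv(scores_path, show_col_types = FALSE)
```

A CSV stores only text, so column types are guessed again on re-import; types such as factors do not survive the round trip.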

Data Entry

tibble() (aka a data frame)
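A minimal sketch of entering data by column with `tibble()`. The table `readings` and its values are hypothetical.

```{r}
library(tibble)

# Each argument is one column; all columns must have the same length
readings <- tibble(
  sample_id = 1:3,
  label = c("low", "mid", "high"),
  value = c(0.2, 1.5, 3.1)
)
```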

tribble() (transposed tibble)
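A minimal sketch of entering data by row with `tribble()`. Column headings start with `~` and entries are separated by commas, so the code is laid out like the table it creates. The table `readings` and its values are hypothetical.

```{r}
library(tibble)

# Each line after the header is one row of the table
readings <- tribble(
  ~sample_id, ~label, ~value,
  1,          "low",  0.2,
  2,          "mid",  1.5,
  3,          "high", 3.1
)
```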

Workflow: getting help

Google

Stack Overflow

GenAI (ChatGPT, Claude, GitHub Copilot, etc.)

End of Module 3

References